59 research outputs found

    Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment

    Dysarthria is a disability that disturbs the human speech system and reduces the quality and intelligibility of a person's speech. Because of this effect, standard speech processing systems cannot work properly on impaired speech. Dysarthria is usually associated with physical disabilities, so a system that can perform tasks in a smart home by receiving voice commands can be a significant achievement. In this work, we introduce the gammatonegram as an effective method to represent audio files with discriminative details, which is used as input for a convolutional neural network (CNN). In other words, we convert each speech file into an image and propose an image recognition system to classify speech in different scenarios. The proposed CNN is based on transfer learning from the pre-trained AlexNet. In this research, the efficiency of the proposed system is evaluated for speech recognition, speaker identification, and intelligibility assessment. According to the results on the UA dataset, the proposed speech recognition system achieved 91.29% accuracy in speaker-dependent mode, the speaker identification system achieved 87.74% accuracy in text-dependent mode, and the intelligibility assessment system achieved 96.47% accuracy in two-class mode. Finally, we propose a multi-network speech recognition system that works fully automatically. This system is arranged in cascade with the two-class intelligibility assessment system, whose output activates one of the speech recognition networks. This architecture achieves an accuracy of 92.3% WRR. The source code of this paper is available. Comment: 12 pages, 8 figures

    Scalable and Language-Independent Embedding-based Approach for Plagiarism Detection Considering Obfuscation Type: No Training Phase

    The efficiency and scalability of plagiarism detection systems have become a major challenge due to the vast amount of textual data available in several languages over the Internet. Plagiarism occurs at different levels of obfuscation, ranging from an exact copy of the original material to text summarization. Consequently, algorithms designed to detect plagiarism should be robust to diverse languages and to the different types of obfuscation in plagiarism cases. In this paper, we employ text embedding vectors to compare similarity among documents to detect plagiarism. Word vectors are combined by a simple aggregation function to represent a text document. This representation comprises semantic and syntactic information of the text and leads to efficient text alignment between suspicious and original documents. By comparing representations of sentences in source and suspicious documents, the sentence pairs with the highest similarity are considered the candidates, or seeds, of plagiarism cases. To filter and merge these seeds, a set of parameters, including Jaccard similarity and a merging threshold, is tuned by two different approaches: offline tuning and online tuning. The offline method, which is used as the benchmark, sets a single set of parameters for all types of plagiarism through several trials on the training corpus. Experiments show improvements in performance when the obfuscation type is considered during threshold tuning. In this regard, our proposed online approach uses two statistical methods to filter outlier candidates automatically according to their scale of obfuscation. With the online tuning approach, no distinct training dataset is required to train the system. We applied our proposed method to available datasets in English, Persian, and Arabic on the text alignment task, to evaluate the robustness of the proposed methods from the language perspective as well.
    As our experimental results confirm, our efficient approach can achieve considerable performance on the different datasets in various languages. Our online threshold tuning approach, without any training datasets, works as well as, or in some cases even better than, the training-based method. The work of Paolo Rosso was partially funded by the Spanish MICINN under the research project MISMIS-FAKEn-HATE on Misinformation and Miscommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31). Gharavi, E.; Veisi, H.; Rosso, P. (2020). Scalable and Language-Independent Embedding-based Approach for Plagiarism Detection Considering Obfuscation Type: No Training Phase. Neural Computing and Applications. 32(14):10593-10607. https://doi.org/10.1007/s00521-019-04594-y
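
The seed-finding step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy word vectors, the mean used as the aggregation function, the `min_cosine` cutoff, and all function names are assumptions made for illustration. The abstract does not say exactly how Jaccard similarity enters the filtering and merging stage, so the sketch only reports it alongside each seed.

```python
import math

# Toy 3-dimensional word vectors; purely illustrative, not taken from the paper.
TOY_VECTORS = {
    "cats": [0.9, 0.1, 0.0],
    "felines": [0.85, 0.15, 0.0],
    "chase": [0.1, 0.9, 0.0],
    "pursue": [0.15, 0.85, 0.0],
    "mice": [0.7, 0.0, 0.3],
    "stocks": [0.0, 0.1, 0.9],
    "fell": [0.0, 0.3, 0.7],
    "today": [0.1, 0.1, 0.5],
}


def embed(sentence):
    """Represent a sentence as the mean of its known word vectors."""
    vecs = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vecs) for column in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard similarity between the word sets of two sentences."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def find_seeds(suspicious, source, min_cosine=0.9):
    """For each suspicious sentence, keep its most similar source sentence.

    min_cosine is a hypothetical fixed cutoff; the paper tunes its
    thresholds offline or online instead.
    """
    seeds = []
    for i, sus in enumerate(suspicious):
        scores = [cosine(embed(sus), embed(src)) for src in source]
        j = max(range(len(source)), key=scores.__getitem__)
        if scores[j] >= min_cosine:
            seeds.append((i, j, scores[j], jaccard(sus, source[j])))
    return seeds


source_sentences = ["cats chase mice", "stocks fell today"]
suspicious_sentences = ["felines pursue mice"]
seeds = find_seeds(suspicious_sentences, source_sentences)
```

In this toy example the paraphrased sentence shares only one word with its source, so its Jaccard score is low while its embedding similarity is high, which is the kind of obfuscated case the embedding representation is meant to catch.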

    Central Kurdish Sentiment Analysis Using Deep Learning

    Sentiment Analysis (SA), a type of opinion mining and a more general topic than polarity detection, is widely used for analyzing users' reviews and comments online. It is implemented using various techniques, among which the Artificial Neural Network (ANN) is the most popular. This paper addresses the development of an SA system for the Central Kurdish language (CKB) using deep learning. Increasing the efficiency and strength of the SA system relies on a robust language model. Creating and training a robust language model requires a large text corpus, so we have created a corpus of 300 million tokens for CKB. To train the SA model, we also collected 14,881 comments on Facebook, which were then labeled manually. The combination of Word2Vec for the language model and Long Short-Term Memory (LSTM) for the classifier is used to create an SA model on the CKB SA dataset. These deep learning-based techniques are the most well-known methods in this field and have achieved high performance in SA for various languages. The proposed method achieves 71.35% accuracy for three-class SA. This result is superior to the best-reported result for CKB.

    Fuzzy fractional-order sliding mode control of COVID-19 virus variants

    These days, one of the biggest challenges in the world is dealing with the outbreak of the COVID-19 virus. Recently, new variants of this virus have been identified that have a much higher rate of transmission. To effectively control and manage the spread of the disease, a clear understanding of its transmission dynamics and effective control techniques to reduce or inhibit the spread of the virus are necessary. Although vaccine production and distribution are currently underway, Non-Pharmacological Interventions (NPI) continue to be an important and fundamental strategy for controlling the spread of the virus in various countries around the world. In this paper, COVID-19 dynamics are modeled using four well-known categories (SEIR): Susceptible-Exposed-Infected-Recovered. Since the parameters of the model are uncertain, a robust control method should be designed. In this paper, using fractional calculus and fuzzy logic, a robust fuzzy fractional-order sliding mode controller (FOFSMC) for COVID-19 dynamics is proposed, which aims to control the prevalence of the disease using NPI. The proposed method is implemented on both the integer-order and the fractional-order model. Finally, the performance of the proposed controller is evaluated on the new variant of the COVID-19 virus, which has a faster disease transmission rate.
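
As a rough illustration of the SEIR compartments mentioned in this abstract, the sketch below integrates the textbook integer-order SEIR equations with a forward Euler step. It is not the paper's model or controller: the abstract gives no equations, so the standard SEIR form, every parameter value, and the constant `npi` factor (a fixed fractional reduction in transmission, standing in for a control input) are assumptions. The fractional-order dynamics and the fuzzy sliding mode controller are not reproduced here.

```python
def simulate_seir(beta=0.5, sigma=0.2, gamma=0.1, npi=0.0, days=200, dt=0.1):
    """Forward-Euler simulation of a normalized SEIR model.

    beta:  transmission rate (assumed value)
    sigma: rate at which exposed individuals become infectious (assumed)
    gamma: recovery rate (assumed)
    npi:   constant fraction by which interventions cut transmission,
           a hypothetical stand-in for a control input
    Returns the final (S, E, I, R) fractions and the peak infected fraction.
    """
    s, e, i, r = 0.99, 0.0, 0.01, 0.0  # population fractions summing to 1
    peak = i
    for _ in range(int(days / dt)):
        new_exposed = (1.0 - npi) * beta * s * i
        ds = -new_exposed
        de = new_exposed - sigma * e
        di = sigma * e - gamma * i
        dr = gamma * i
        s, e, i, r = s + dt * ds, e + dt * de, i + dt * di, r + dt * dr
        peak = max(peak, i)
    return (s, e, i, r), peak


final_free, peak_free = simulate_seir(npi=0.0)
final_npi, peak_npi = simulate_seir(npi=0.6)
```

Comparing `peak_free` with `peak_npi` shows the qualitative effect the abstract describes: reducing the effective transmission rate lowers the peak infected fraction. A faster-spreading variant could be explored by raising `beta`.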